
Collaborating Authors

 Ross Sea


A huge iceberg becomes a deadly trap for penguins

Popular Science

An iceberg sealed the penguin colony's entrance, triggering a 70% survival drop. A massive iceberg has triggered a catastrophic die-off of Emperor penguin chicks in Antarctica, blocking thousands of parents from reaching their young. The event claimed the lives of approximately 14,000 chicks at the Coulman Island colony in the Ross Sea, the region's largest breeding ground.


A Self-Evolving AI Agent System for Climate Science

Guo, Zijie, Wang, Jiong, Ling, Fenghua, Wei, Wangxu, Yue, Xiaoyu, Jiang, Zhe, Xu, Wanghan, Luo, Jing-Jia, Cheng, Lijing, Ham, Yoo-Geun, Song, Fengfei, Gentine, Pierre, Yamagata, Toshio, Fei, Ben, Zhang, Wenlong, Gu, Xinyu, Li, Chao, Wang, Yaqiang, Chen, Tao, Ouyang, Wanli, Zhou, Bowen, Bai, Lei

arXiv.org Artificial Intelligence

Scientific progress in Earth science depends on integrating data across the planet's interconnected spheres. However, the accelerating volume and fragmentation of multi-sphere knowledge and data have surpassed human analytical capacity. This creates a major bottleneck for discovery, especially in climate science. To address this challenge, we introduce EarthLink, the first self-evolving AI agent system designed as an interactive "copilot" for Earth scientists. Through natural language interaction, EarthLink automates the entire research workflow by integrating planning, code execution, data analysis, and physical reasoning into a unified process that directly addresses this limitation. Beyond efficiency, it exhibits human-like cross-disciplinary analytical ability and achieves proficiency comparable to a junior researcher in expert evaluations on core large-scale climate tasks, including model-observation comparison and climate change understanding. When tasked with an open scientific problem, specifically the discovery of precursors of the Atlantic Niño, EarthLink autonomously developed a research strategy, identified sources of predictability, verified its hypotheses with available data, and proposed a physically consistent mechanism. These emerging capabilities enable a new human-AI research paradigm. Scientists can focus on value and result judgments, while AI systems handle complex data analysis and knowledge integration. This accelerates the pace and breadth of discovery in Earth sciences. The system is accessible at our website https://earthlink.intern-ai.org.cn.


RADAR: Benchmarking Language Models on Imperfect Tabular Data

Gu, Ken, Zhang, Zhihan, Lin, Kate, Zhang, Yuwei, Paruchuri, Akshay, Yu, Hong, Kazemi, Mehran, Ayush, Kumar, Heydari, A. Ali, Xu, Maxwell A., Narayanswamy, Girish, Liu, Yun, Poh, Ming-Zher, Yang, Yuzhe, Malhotra, Mark, Patel, Shwetak, Palangi, Hamid, Xu, Xuhai, McDuff, Daniel, Althoff, Tim, Liu, Xin

arXiv.org Artificial Intelligence

Language models (LMs) are increasingly being deployed to perform autonomous data analyses. However, their data awareness -- the ability to recognize, reason over, and appropriately handle data artifacts such as missing values, outliers, and logical inconsistencies -- remains underexplored. These artifacts are especially common in real-world tabular data and, if mishandled, can significantly compromise the validity of analytical conclusions. To address this gap, we present RADAR, a benchmark for systematically evaluating data-aware reasoning on tabular data. We develop a framework to simulate data artifacts via programmatic perturbations to enable targeted evaluation of model behavior. RADAR comprises 2980 table query pairs, grounded in real-world data spanning 9 domains and 5 data artifact types. In addition to evaluating artifact handling, RADAR systematically varies table size to study how reasoning performance holds when increasing table size. Our evaluation reveals that, despite decent performance on tables without data artifacts, frontier models degrade significantly when data artifacts are introduced, exposing critical gaps in their capacity for robust, data-aware analysis. Designed to be flexible and extensible, RADAR supports diverse perturbation types and controllable table sizes, offering a valuable resource for advancing tabular reasoning.
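
As a rough illustration of what a programmatic perturbation could look like, the sketch below injects missing values into a toy table and compares an analysis that ignores the artifact with one that handles it. It is not RADAR's framework: the function names, the `steps` column, and the 30% missing rate are all assumptions chosen for the example.

```python
import copy
import random


def inject_missing(rows, column, fraction, seed=0):
    """Return a copy of `rows` with a fraction of `column` values set to None."""
    rng = random.Random(seed)
    perturbed = copy.deepcopy(rows)  # leave the clean table untouched
    n_missing = round(fraction * len(perturbed))
    for idx in rng.sample(range(len(perturbed)), n_missing):
        perturbed[idx][column] = None
    return perturbed


def naive_mean(rows, column):
    """Treats missing values as 0, which silently biases the result."""
    return sum(r[column] or 0 for r in rows) / len(rows)


def aware_mean(rows, column):
    """Drops missing values before averaging."""
    values = [r[column] for r in rows if r[column] is not None]
    return sum(values) / len(values)


# Hypothetical clean table: ten days with the same step count.
table = [{"day": d, "steps": 8000} for d in range(1, 11)]
perturbed = inject_missing(table, "steps", fraction=0.3)

print(naive_mean(perturbed, "steps"))  # biased low by the injected gaps
print(aware_mean(perturbed, "steps"))  # matches the clean table's mean
```

Because the perturbation is applied to a copy of a clean table, the correct answer is known in advance, which is what makes a targeted check of artifact handling possible.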


Cetvel: A Unified Benchmark for Evaluating Language Understanding, Generation and Cultural Capacity of LLMs for Turkish

Er, Yakup Abrek, Kesen, Ilker, Şahin, Gözde Gül, Erdem, Aykut

arXiv.org Artificial Intelligence

We introduce Cetvel, a comprehensive benchmark designed to evaluate large language models (LLMs) in Turkish. Existing Turkish benchmarks often lack task diversity, culturally relevant content, or both. Cetvel addresses these gaps by combining a broad range of discriminative and generative tasks with content that reflects the linguistic and cultural richness of the Turkish language. Cetvel covers 23 tasks grouped into seven categories, including tasks such as grammatical error correction, machine translation, and question answering rooted in Turkish history and idiomatic language. We evaluate 33 open-weight LLMs (up to 70B parameters) covering different model families and instruction paradigms. Our experiments reveal that Turkish-centric instruction-tuned models generally underperform relative to multilingual or general-purpose models (e.g. Llama 3 and Mistral), despite being tailored for the language. Moreover, we show that tasks such as grammatical error correction and extractive question answering are particularly effective at differentiating model capabilities. Cetvel offers a comprehensive and culturally grounded evaluation suite for advancing the development and assessment of LLMs in Turkish.


Modeling Heterogeneity across Varying Spatial Extents: Discovering Linkages between Sea Ice Retreat and Ice Shelve Melt in the Antarctic

Devnath, Maloy Kumar, Chakraborty, Sudip, Janeja, Vandana P.

arXiv.org Artificial Intelligence

Spatial phenomena often exhibit heterogeneity across spatial extents and in proximity, making them complex to model, especially in dynamic regions like ice shelves and sea ice. In this study, we address this challenge by exploring the linkages between sea ice retreat and Antarctic ice shelf (AIS) melt. Although atmospheric forcing and basal melting have been widely studied, the direct impact of sea ice retreat on AIS mass loss remains underexplored. Traditional models treat sea ice and AIS as separate systems, which limits their ability to capture localized linkages and cascading feedback. To overcome this, we propose Spatial-Link, a novel graph-based framework that quantifies spatial heterogeneity to capture linkages between sea ice retreat and AIS melt. Our method constructs a spatial graph using Delaunay triangulation of satellite-derived ice change matrices, where nodes represent regions of significant change and edges encode proximity and directional consistency. We extract and statistically validate linkage paths using breadth-first search and Monte Carlo simulations. Results reveal non-local, spatially heterogeneous coupling patterns, suggesting sea ice loss can initiate or amplify downstream AIS melt. Our analysis shows how sea ice retreat evolves over an oceanic grid and progresses toward ice shelves, establishing a direct linkage. To our knowledge, this is the first proposed methodology linking sea ice retreat to AIS melt. Spatial-Link offers a scalable, data-driven tool to improve sea-level rise projections and inform climate adaptation strategies.
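
The path-extraction and validation idea can be sketched on a toy grid. This is not the authors' Spatial-Link code: to stay dependency-free it connects nodes with a simple distance threshold instead of Delaunay triangulation, it ignores directional consistency, and every name, the 6x6 grid, and the trial count are assumptions made for illustration.

```python
import random
from collections import deque


def build_graph(nodes, max_dist):
    """Connect nodes no farther apart than max_dist (stand-in for Delaunay edges)."""
    pts = list(nodes)
    adj = {p: set() for p in pts}
    for i, a in enumerate(pts):
        for b in pts[i + 1:]:
            if (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 <= max_dist ** 2:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def bfs_path(adj, start, goal):
    """Shortest path (in hops) from start to goal, or None if disconnected."""
    if start not in adj or goal not in adj:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in adj[node]:
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None


def monte_carlo_p_value(cells, n_changed, start, goal, max_dist, trials=500, seed=0):
    """Fraction of random layouts of n_changed cells that still link start to goal."""
    rng = random.Random(seed)
    others = [c for c in cells if c not in (start, goal)]
    hits = 0
    for _ in range(trials):
        layout = rng.sample(others, n_changed - 2) + [start, goal]
        if bfs_path(build_graph(layout, max_dist), start, goal) is not None:
            hits += 1
    return hits / trials


# Hypothetical 6x6 grid; the "observed" change cells form a corridor
# from a sea ice cell (0, 0) to an ice shelf cell (5, 0).
cells = [(x, y) for x in range(6) for y in range(6)]
changed = [(x, 0) for x in range(6)]
start, goal = (0, 0), (5, 0)

path = bfs_path(build_graph(changed, max_dist=1.0), start, goal)
p_value = monte_carlo_p_value(cells, len(changed), start, goal, max_dist=1.0)
print(path, p_value)
```

A small `p_value` here means that randomly scattered change cells rarely form a connected route between the two endpoints, so the observed corridor is unlikely to be a chance arrangement.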


Exploring the Potential of Latent Embeddings for Sea Ice Characterization using ICESat-2 Data

Han, Daehyeon, Karimzadeh, Morteza

arXiv.org Artificial Intelligence

The Ice, Cloud, and Elevation Satellite-2 (ICESat-2) provides high-resolution measurements of sea ice height. Recent studies have developed machine learning methods on ICESat-2 data, primarily focusing on surface type classification. However, supervised learning relies heavily on manually collected labels, which take significant time and effort to produce because track measurements must be cross-referenced with overlapping background optical imagery. Additionally, the coincidence of ICESat-2 tracks with background images is relatively rare due to the different overpass patterns and atmospheric conditions. To address these limitations, this study explores the potential of unsupervised autoencoders trained on unlabeled data to derive latent embeddings. We develop autoencoder models based on Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN) to reconstruct topographic sequences from ICESat-2 and derive embeddings. We then apply Uniform Manifold Approximation and Projection (UMAP) to reduce dimensions and visualize the embeddings. Our results show that embeddings from autoencoders preserve the overall structure but generate relatively more compact clusters compared to the original ICESat-2 data, indicating the potential of embeddings to reduce the number of labeled samples required.


Scalable Higher Resolution Polar Sea Ice Classification and Freeboard Calculation from ICESat-2 ATL03 Data

Iqrah, Jurdana Masuma, Koo, Younghyun, Wang, Wei, Xie, Hongjie, Prasad, Sushil K.

arXiv.org Artificial Intelligence

ICESat-2 (IS2) by NASA is an Earth-observing satellite that measures high-resolution surface elevation. IS2's ATL07 and ATL10 sea ice elevation and freeboard products consist of 10m-200m segments, each aggregating 150 signal photons from the raw ATL03 (geolocated photon) data. These aggregated products can potentially overestimate local sea surface height and thus underestimate freeboard (sea ice height above sea surface). To achieve a higher resolution of sea surface height and freeboard information, in this work we utilize a 2m window to resample the ATL03 data. Then, we classify these 2m segments into thick sea ice, thin ice, and open water using deep learning methods (Long short-term memory and Multi-layer perceptron models). To obtain labeled training data for our deep learning models, we use segmented Sentinel-2 (S2) multi-spectral imagery overlapping with IS2 tracks in space and time to auto-label IS2 data, followed by some manual corrections in the regions of transition between different ice/water types or cloudy regions. We employ a parallel workflow for this auto-labeling using PySpark to scale, and we achieve 9-fold data loading and 16.25-fold map-reduce speedup. To train our models, we employ a Horovod-based distributed deep-learning workflow on a DGX A100 8 GPU cluster, achieving a 7.25-fold speedup. Next, we calculate the local sea surface heights based on the open water segments. Finally, we scale the freeboard calculation using the derived local sea level and achieve 8.54-fold data loading and 15.7-fold map-reduce speedup. Compared with the ATL07 (local sea level) and ATL10 (freeboard) data products, our results show higher resolutions and accuracy (96.56%).
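
The resample-then-subtract logic described above can be sketched in a few lines. This is a simplified, single-machine illustration rather than the paper's PySpark pipeline: the surface labels are supplied by hand instead of coming from a trained classifier, the local sea surface is taken as a plain mean of open-water segments, and the photon values and function names are invented for the example.

```python
from statistics import mean


def resample(photons, window=2.0):
    """Average photon heights within fixed along-track windows.

    photons: iterable of (along_track_distance_m, height_m) pairs.
    Returns {window_index: mean_height_m}.
    """
    bins = {}
    for dist, height in photons:
        bins.setdefault(int(dist // window), []).append(height)
    return {k: mean(v) for k, v in sorted(bins.items())}


def freeboard(segment_heights, labels):
    """Ice height above the local sea surface for each non-water segment."""
    water = [h for k, h in segment_heights.items() if labels[k] == "open_water"]
    if not water:
        raise ValueError("no open-water segment to define the local sea surface")
    sea_surface = mean(water)
    return {
        k: h - sea_surface
        for k, h in segment_heights.items()
        if labels[k] != "open_water"
    }


# Hypothetical photons: (along-track distance in m, height in m).
photons = [(0.5, 0.10), (1.5, 0.12), (2.5, 0.50), (3.5, 0.54), (4.2, 0.30), (5.9, 0.34)]
# Assumed classifier output for the three 2 m segments.
labels = {0: "open_water", 1: "thick_ice", 2: "thin_ice"}

segments = resample(photons, window=2.0)
result = freeboard(segments, labels)
print(result)
```

The sketch also shows why the sea surface estimate matters: any overestimate of `sea_surface` is subtracted from every ice segment, so it lowers all freeboard values by the same amount.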


Antarctica's 'Doomsday Glacier' is on the verge of COLLAPSING: Huge ice sheet the size of Great Britain could cause global sea levels to rise by 2 FEET, study warns

Daily Mail - Science & tech

With the potential to cause sea levels across the planet to rise, it's no wonder the Thwaites Glacier has earned the nickname the 'Doomsday Glacier.'
Now, scientists have revealed concerning findings about how and when the glacier could collapse. Researchers from the British Antarctic Survey (BAS) used underwater robots to take new measurements of the glacier, which is the same size as Great Britain. The data indicates that the Thwaites Glacier and much of the West Antarctic Ice Sheet could be lost entirely by the 23rd century. Worryingly, if it collapses entirely, the experts say global sea levels would rise by two feet (65cm) - plunging huge areas underwater. The Thwaites Glacier is roughly 74.5 miles (120km) across - the same size as Great Britain or Florida - making it the widest glacier on the planet.


Graph Neural Networks for Emulation of Finite-Element Ice Dynamics in Greenland and Antarctic Ice Sheets

Koo, Younghyun, Rahnemoonfar, Maryam

arXiv.org Artificial Intelligence

Although numerical models provide accurate solutions for ice sheet dynamics based on physical laws, solving the underlying partial differential equations is computationally expensive. In recent years, convolutional neural networks (CNNs) have been widely used as statistical emulators for those numerical models. However, since CNNs operate on regular grids, they cannot represent the refined meshes and computational efficiency of finite-element numerical models. Therefore, instead of CNNs, this study adopts an equivariant graph convolutional network (EGCN) as an emulator for ice sheet dynamics modeling. EGCN reproduces ice thickness and velocity changes in the Helheim Glacier, Greenland, and Pine Island Glacier, Antarctica, with 260 times and 44 times faster computation time, respectively. Compared to the traditional CNN and graph convolutional network, EGCN shows outstanding accuracy in thickness prediction near fast ice streams by preserving the equivariance to the translation and rotation of graphs.


Negative Label Guided OOD Detection with Pretrained Vision-Language Models

Jiang, Xue, Liu, Feng, Fang, Zhen, Chen, Hong, Liu, Tongliang, Zheng, Feng, Han, Bo

arXiv.org Artificial Intelligence

Out-of-distribution (OOD) detection aims at identifying samples from unknown classes, playing a crucial role in trustworthy models against errors on unexpected inputs. Extensive research has been dedicated to exploring OOD detection in the vision modality. Vision-language models (VLMs) can leverage both textual and visual information for various multi-modal applications, whereas few OOD detection methods take into account information from the text modality. In this paper, we propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases. We design a novel scheme for the OOD score collaborated with negative labels. Theoretical analysis helps to understand the mechanism of negative labels. Extensive experiments demonstrate that our method NegLabel achieves state-of-the-art performance on various OOD detection benchmarks and generalizes well on multiple VLM architectures. Furthermore, our method NegLabel exhibits remarkable robustness against diverse domain shifts. In open-world scenarios, deploying machine learning models faces a critical challenge: how to handle data from unknown classes, commonly referred to as out-of-distribution (OOD) data (Hendrycks & Gimpel, 2017). The presence of OOD data can lead to models exhibiting overconfidence, potentially resulting in severe errors or security risks. This issue is particularly pronounced in critical applications, such as autonomous vehicles and medical diagnosis. Therefore, detecting and rejecting OOD data plays a crucial role in ensuring the reliability and safety of the model. Traditional visual OOD detection methods (Hsu et al., 2020a; Wang et al., 2021b; Huang et al., 2021; Sun et al., 2021; Wang et al., 2021a) typically rely solely on image information, ignoring the rich textual information carried by labels. Vision-language models (VLMs) can leverage multimodal information, which is also beneficial for OOD detection.
Some recently proposed methods attempt to design dedicated OOD detectors for VLMs. Specifically, ZOC (Esmaeilpour et al., 2022) defines the new task - zero-shot OOD detection, and uses a trainable captioner to generate candidate OOD labels to match OOD images. However, when dealing with large-scale datasets encompassing a multitude of in-distribution (ID) classes, like ImageNet-1k, the captioner may not generate effective candidate OOD labels, resulting in poor performance. MCM (Ming et al., 2022a) uses the maximum logit of scaled softmax to identify OOD images. However, MCM only employs information from the ID label space and does not effectively exploit the text interpretation capabilities of VLMs.
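
To make the MCM description concrete, the sketch below computes a maximum-softmax score over temperature-scaled cosine similarities between an image embedding and ID label embeddings. It is an illustration under stated assumptions, not the reference implementation: the two-dimensional embeddings are toy values rather than VLM outputs, and the function names and the default temperature of 1.0 are chosen only for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mcm_score(image_emb, id_label_embs, temperature=1.0):
    """Largest softmax probability over temperature-scaled similarities.

    A higher score suggests the image matches one ID label clearly;
    a lower score suggests it may be out-of-distribution.
    """
    sims = [cosine(image_emb, e) / temperature for e in id_label_embs]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]  # subtract max for numerical stability
    return max(exps) / sum(exps)


# Toy text embeddings for two hypothetical ID labels.
id_labels = [[1.0, 0.0], [0.0, 1.0]]

id_like = mcm_score([1.0, 0.1], id_labels)    # close to the first label
ambiguous = mcm_score([1.0, 1.0], id_labels)  # equally close to both labels
print(id_like, ambiguous)
```

Only the ID label embeddings enter the score, which is the limitation noted above: nothing in the computation describes what an out-of-distribution image might look like in text space.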